13 research outputs found

    APAUNet: Axis Projection Attention UNet for Small Target in 3D Medical Segmentation

    Full text link
    In 3D medical image segmentation, small-target segmentation is crucial for diagnosis but remains challenging. In this paper, we propose the Axis Projection Attention UNet, named APAUNet, for 3D medical image segmentation, especially for small targets. Considering the large proportion of background in the 3D feature space, we introduce a projection strategy that projects the 3D features onto three orthogonal 2D planes to capture contextual attention from different views. In this way, we can filter out redundant feature information and mitigate the loss of critical information for small lesions in 3D scans. We then use a dimension hybridization strategy to fuse the 3D features with attention from different axes and merge them by a weighted summation to adaptively learn the importance of different perspectives. Finally, in the APA Decoder, we concatenate both high- and low-resolution features in the 2D projection process, thereby obtaining more precise multi-scale information, which is vital for small lesion segmentation. Quantitative and qualitative experimental results on two public datasets (BTCV and MSD) demonstrate that our proposed APAUNet outperforms the other methods. Concretely, APAUNet achieves an average Dice score of 87.84 on BTCV, 84.48 on MSD-Liver and 69.13 on MSD-Pancreas, and significantly surpasses the previous SOTA methods on small targets. Comment: Accepted by ACCV202
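
The projection-and-fusion idea in the abstract above can be loosely illustrated as follows. This is only a sketch under stated assumptions, not the APAUNet implementation: the function name `axis_projection_fuse`, the use of mean pooling as the projection, the sigmoid gate standing in for attention, and the fixed (rather than learned) fusion weights are all hypothetical choices made for illustration.

```python
import numpy as np


def axis_projection_fuse(volume, weights):
    """Gate a 3D volume with three 2D projections and sum the results.

    volume  : array of shape (D, H, W)
    weights : three non-negative numbers (fixed here; the abstract
              describes weights that are learned adaptively)
    """
    volume = np.asarray(volume, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the fusion is a weighted average

    fused = np.zeros_like(volume)
    for axis in range(3):
        # Collapse one axis to get the 2D plane orthogonal to it.
        # Mean pooling is an assumption, not taken from the paper.
        plane = volume.mean(axis=axis, keepdims=True)
        # A sigmoid turns the projection into a gate in (0, 1);
        # this stands in for the attention computed on each plane.
        gate = 1.0 / (1.0 + np.exp(-plane))
        # Broadcasting applies the 2D gate along the collapsed axis.
        fused += w[axis] * (volume * gate)
    return fused


features = np.ones((2, 3, 4))
out = axis_projection_fuse(features, weights=[1.0, 1.0, 1.0])
print(out.shape)  # (2, 3, 4)
```

The output keeps the input shape because each 2D gate is broadcast back over the axis it was projected along.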

    Mustango: Toward Controllable Text-to-Music Generation

    Full text link
    With recent advancements in text-to-audio and text-to-music based on latent diffusion models, the quality of generated content has been reaching new heights. The controllability of musical aspects, however, has not yet been explicitly explored in text-to-music systems. In this paper, we present Mustango, a music-domain-knowledge-inspired text-to-music system based on diffusion that extends the Tango text-to-audio model. Mustango aims to control the generated music not only with general text captions but also with richer captions that can include specific instructions related to chords, beats, tempo, and key. As part of Mustango, we propose MuNet, a Music-Domain-Knowledge-Informed UNet sub-module that integrates these music-specific features, which we predict from the text prompt, as well as the general text embedding, into the diffusion denoising process. To overcome the limited availability of open datasets of music with text captions, we propose a novel data augmentation method that alters the harmonic, rhythmic, and dynamic aspects of music audio and uses state-of-the-art Music Information Retrieval methods to extract music features, which are then appended to the existing descriptions in text format. We release the resulting MusicBench dataset, which contains over 52K instances and includes music-theory-based descriptions in the caption text. Through extensive experiments, we show that the quality of the music generated by Mustango is state-of-the-art, and that its controllability through music-specific text prompts greatly outperforms other models in terms of desired chords, beat, key, and tempo on multiple datasets.

    Urban sound analysis and synthesis using artificial intelligence

    No full text
    With the advent of artificial intelligence and machine learning, multiple industries have gone through different kinds of revolution. For example, convolutional neural networks have drastically changed the conventional ways in which computers capture features of images and video, a field known as computer vision. In the audio domain, artificial intelligence has been widely used in areas such as sound classification and speech-to-text conversion. In this work, I mainly focus on the use of artificial intelligence in urban sound analysis and processing, which has been shown to perform much better than conventional methods. Unlike images or videos, analog sound has to be sampled and quantized in order to be stored in digital format. Only digital sound is considered in this work, since neural networks can only process digital values. Digital sound also has its own set of features, such as sampling frequency and bit depth. Various research works have also used sound features in the frequency domain, such as bandwidth. One important feature of digital sound, the sampling frequency, is normally above 8 kHz. This raises issues in audio processing, since one second of audio contains at least thousands of discrete digital values. In order to process large amounts of sound samples in a sequential manner, the focus of this work is on recurrent neural networks, a type of network structure with its own memory mechanism that can deal with long-term dependencies. I focus on two topics: audio captioning and audio synthesis. First, captioning using AI has been widely used in the field of computer vision; audio captioning would similarly help people with hearing difficulties perceive sound information. Second, audio data collection can be time-consuming and costly. However, by learning audio patterns and inter-dependencies, sound synthesis could generate sound more efficiently. Bachelor of Engineering (Electrical and Electronic Engineering)
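
The sampling and quantization step described in the abstract above can be sketched with the standard library alone. This is a minimal illustration, not code from the thesis: the function name `sample_and_quantize`, the 440 Hz test tone, and the 16-bit depth are assumptions chosen for the example; only the 8 kHz sampling rate comes from the text.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, duration_s=1.0):
    """Sample a sine tone and round each sample to a signed integer level."""
    n_samples = int(sample_rate_hz * duration_s)
    # Largest positive level for signed PCM, e.g. 32767 for 16 bits.
    max_level = 2 ** (bit_depth - 1) - 1
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                           # sampling: discrete time
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # "analog" value in [-1, 1]
        samples.append(round(amplitude * max_level))     # quantization
    return samples


pcm = sample_and_quantize(freq_hz=440.0, sample_rate_hz=8000, bit_depth=16)
print(len(pcm))  # 8000
```

Even at the 8 kHz lower bound mentioned in the abstract, one second of audio already yields 8000 integer values, which is why sequence length becomes a concern for sequential models.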

    Exogenous supplement of N-acetylneuraminic acid improves macrophage reverse cholesterol transport in apolipoprotein E-deficient mice

    No full text
    Background: N-acetylneuraminic acid (NANA) is the major form of sialic acid in mammals, and the plasma NANA level is increased in patients with cardiovascular diseases. Exogenous supplementation of NANA has been demonstrated to reduce hyperlipidaemia and the formation of atherosclerotic lesions; however, the underlying mechanisms have not yet been clarified. The aim of this study is to investigate whether exogenous supplementation of NANA improves reverse cholesterol transport (RCT) in vivo. Methods: Apolipoprotein E-deficient mice fed a high-fat diet were used to investigate the effect of NANA on RCT using [3H]-cholesterol-loaded macrophages, and the underlying mechanism was further investigated by various molecular techniques using fenofibrate as a positive control. Results: Exogenous supplementation of NANA significantly improved [3H]-cholesterol transfer from [3H]-cholesterol-loaded macrophages to the plasma (an increase of > 42.9%), liver (an increase of 35.8%), and finally to the feces (an increase of 50.4% from 0 to 24 h) for excretion in apolipoprotein E-deficient mice fed a high-fat diet. In addition, NANA upregulated the protein expression of ATP-binding cassette (ABC) G1 and peroxisome proliferator-activated receptor α (PPARα), but not the protein expression of ABCA1 and scavenger receptor B type 1 in the liver. Therefore, the underlying mechanism of NANA in improving RCT may be partially due to the elevated protein levels of PPARα and ABCG1. Conclusion: Exogenous supplementation of NANA improves RCT in apolipoprotein E-deficient mice fed a high-fat diet, mainly by increasing the protein expression of PPARα and ABCG1. These results help explain the lipid-lowering effect of NANA.

    Biointerface design for vertical nanoprobes

    No full text
    Biointerfaces mediate safe and efficient cell manipulation, which is essential for biomedical innovations in advanced therapies and diagnostics. The biointerface established by vertical nanoprobes (arrays of vertical high-aspect-ratio nanostructures) has emerged as a simple, controllable and powerful tool for interrogating and manipulating cells. Vertical nanoprobes have substantially improved our ability to control and characterize the intracellular environment, guide biophysical stimuli with nanoscale precision to defined cell compartments, stimulate and record the electrical activity of cells, and transport hard-to-deliver drugs. These capabilities are enabling substantial advances in bioelectronics, spatiotemporally resolved molecular diagnostics, and cell and gene therapy, all underpinned by the design versatility of the nanoprobe biointerface. This Review discusses how the design of a vertical nanoprobe biointerface determines its ability to interrogate and control a cell.
    Ministry of Education (MOE); Nanyang Technological University; National Research Foundation (NRF).
    R.E. thanks the Australian government (ARC DECRA project number: DE170100021), the Melbourne Centre for Nanofabrication (MCN) in the Victorian Node of the Australian National Fabrication Facility (ANFF), the ANFF-Vic Tech Ambassador Program for Deakin University, Deakin’s School of Medicine and Deakin’s Institute of Frontier Materials. X.X. acknowledges financial support from the National Natural Science Foundation of China (grant no. 32171399) and National Key R&D Program of China (grant no. 2021YFF1200700, 2021YFA0911100). P.S. acknowledges support from the Hong Kong Centre for Cerebro-cardiovascular Health Engineering, funded by the Innovation and Technology Commission of Hong Kong. F.S. acknowledges the support of the European Research Council starting grant BRAIN-ACT no. 949478. C.C. acknowledges the support of the European Research Council starting grant ENBION no. 759577. Y.Z. acknowledges the support of the UK Department for Business, Energy, and Industrial Strategy through the National Measurement System (NMS project, Bioelectronics integrated multifunctional physiological measurement platform) and EPSRC Industrial CASE 2020 (20000128). W.Z. acknowledges the support of the Singapore Ministry of Education (MOE) (W.Z., RG112/20, NGF-2021-10-026 and MOET32020-0001), the Singapore National Research Foundation (W.Z., NRF2019-NRFISF003-3292), the Human Frontier Science Program (RGY0088/2021) and the NTU start-up grant. N.H.V. thanks the Australian Research Council for support under the Industrial Transformation Training Centre Scheme (IC170100016 and IC190100026).